APPARATUS AND METHOD FOR PROCESSOR POWER 
MEASUREMENT IN A DIGITAL SIGNAL PROCESSOR 
USING TRACE DATA AND SIMULATION TECHNIQUES 

Background of the Invention 

1. Field of the Invention 

This invention relates generally to digital signal processing units and, more 
particularly, to techniques for determining the power consumption of digital signal 
processor units. 

2. Background of the Invention 

The digital signal processor and related devices have found increasing application 
in portable apparatus, such as cell phones, wireless internet devices, etc. The power 
consumption is a critical parameter in portable apparatus. The power consumption 
determines the size of the battery and the time between recharging the battery, key 
parameters in the portability of devices. 

However, the power consumption parameter for the digital signal processor has 
several variables. The hardware implementing the device can, for example, be designed 
to run with minimum power expenditure. Even after every effort has been employed to 
reduce the power requirements of the hardware, the software programs may not be power 
efficient. Individual programs can be optimized to provide minimum power 
consumption. In addition, not only can the central processing unit draw power, but bus 
activity can also result in the consumption of power. However, before these parameters 
can be optimized, a technique for the measurement of the power consumption must be 
provided. 

In designing and testing a central processing unit, a simulation model is provided 
for the proposed design. Using the simulation model, a simulation of the processing 
activity can be performed for the central processing unit, i.e., for a set of input signals and 
the set of output signals. Even the internal operation of the data processing unit can be 
determined from the simulation model. The simulation model allows design changes and 
improvements to be investigated in the central processing unit without the lengthy process of 
fabricating the apparatus. 

Referring to Fig. 1, a process of designing and fabricating a central processing 
unit is summarized. Based on a series of requirements for a central processing unit and 
based on characteristics of technology used in implementing a central processing unit, a 
simulation model is prepared in step 10. The simulation model simulates the physical 
electrical parameters of a physically-implemented central processing unit. Using the 
simulation model, the operation of the simulation model is tested and the model is refined 
in step 11. Any problems identified at this stage are typically resolved in an updated 
simulation model. Because of the time required to fabricate a central processing 
unit, identifying and resolving problems at this stage has a large impact 
on the schedule for providing a functioning central processing unit. When a final version 
of the simulation model has been achieved, a physical central processing unit is 
fabricated using the simulation model as a template in step 12. The implemented central 
processing unit is tested in step 13. In step 14, the testing of the central processing unit is 
examined to determine if changes are necessary to the central processing unit design and, 
consequently, to the simulation model. When no changes are needed, the process ends in 
step 16. When changes in the central processing unit are required, the process proceeds 
to step 15 wherein the simulation model is modified. After the modifications are 
completed, the process returns to step 11 wherein the simulation model is tested and 
refined. 

However, the simulation models have limitations that become apparent when the 
central processing unit is fabricated. In order to test and verify the operation of the 
implemented central processing unit, selected signals can be retrieved from the central 
processing unit and other selected signals applied to the central processing unit. By way 
of specific example, the JTAG (Joint Test Action Group) protocol identifies specific 
signals for application to the central processing unit and retrieval from the central 
processing unit. The purpose of this protocol is to standardize the signals for 
convenience in testing and debug processes. The signals of the JTAG protocol as well as 
trace signals can be used in the testing and debug processes. The central processing unit 
typically has a trace port dedicated to exchange of the trace signals between selected 
components in the central processing unit and a trace unit. The trace unit is programmed 
to interpret the trace signals received from the central processing unit. While the JTAG 
protocol has been an improvement in the tools available to the designer and developer of 
both the central processing unit and the programs that control the operation of the central 
processing unit, recently, the number of trace signals has been greatly expanded, i.e., 
relative to the number of JTAG protocol signals. The additional signals have been 
particularly useful in obtaining information about the internal state of the central 
processing unit. 

One of the most important applications of the data processing technology has been 
to battery-operated portable devices, for example, hand-held appliances. In these 
applications, the requirement is that the power consumption be as low as possible. The 
devices have been designed for minimum power operation. One further parameter in the 
reduction of power consumption is the program controlling the operation of the data 
processing unit. When initially developed, the program is typically not optimized for 
power consumption. However, several variations in a program may be possible when an 
attempt is made to reduce power consumption in a program. 

A need has therefore been felt for apparatus and an associated method having the 
feature that the power consumption of a central processing unit of a digital signal 
processing system can be measured as the result of execution of a program. The 
apparatus and associated method would further have the feature that the power 
consumption of the program could be related to the individual steps in the program. The 
apparatus and associated method would still further have the feature that the power consumed 
by the central processing unit can be determined for the individual clock cycles during 
the execution of the program. 

Summary of the Invention 

The aforementioned and other features are accomplished, according to the present 
invention, by executing the activity for which power consumption is to be optimized in a 
central processing unit and using the signals collected from the central processing unit to 
execute the same activity on the simulation model. Trace components collect and store in 
memory the input signals to and the output signals from the central processing unit for 
each clock cycle. The signals collected are sufficient to recreate the activity of the central 
processing unit in a simulation model when the initial states are the same. The recorded 
set of input and output signals is applied to a simulation model of the central processing 
unit. The input signals and the output signals permit the state of the central processing 
unit to be determined for each clock cycle when applied to the simulation model from the 
equivalent initial state. Using the simulation model, the power dissipated for each state 
of the data processing unit can be determined. Therefore, using the input and output 
signals to determine the state of the central processing unit for each clock cycle, the total 
power used by the program can be calculated. By relating the power consumed as a 
function of the execution of the program, those portions of the program consuming the 
most power can be examined to determine whether the power being consumed can be 
minimized. 
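
As an illustrative restatement of the calculation summarized above, the per-cycle accounting can be written as a sum. The symbols are assumptions introduced here for clarity and are not notation used in the specification: s_k is the state of the central processing unit determined for clock cycle k, P(s_k) is the power the simulation model associates with that state, T_clk is the clock period, and N is the number of traced clock cycles.

```latex
E_{\text{program}} = T_{\text{clk}} \sum_{k=1}^{N} P(s_k),
\qquad
\bar{P}_{\text{program}} = \frac{E_{\text{program}}}{N \, T_{\text{clk}}}
                         = \frac{1}{N} \sum_{k=1}^{N} P(s_k)
```

Restricting the sum to the clock cycles spent in one portion of the program gives the contribution of that portion.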

Brief Description of the Drawings 

Figure 1 is a flow chart illustrating the use of a simulation model to design and 
test a central processing unit according to the related art. 

Figure 2 is a block diagram of a data processing system capable of using the present 
invention. 

Figure 3 is a flow chart describing the use of the trace signals and trace components 
and a simulation model to determine the power consumption by a central processing unit 
according to the present invention. 

Description of the Preferred Embodiment 

1. Detailed Description of the Figures 

Fig. 1 has been described with respect to the prior related art. 

Referring to Fig. 2, a block diagram of a data processing system capable of 
advantageously using the present invention is shown. The digital signal processing unit 
20 includes central processing unit 21, a plurality of peripheral units 22A through 22N, a 
memory unit 23, and buffer unit 24. The peripheral devices 22A through 22N can have 
interface units that are coupled to devices external to the chip 20. An internal bus 25 
couples the peripheral devices 22A through 22N, the memory unit 23 and the buffer unit 
24 to the central processing unit 21. The buffer unit 24 serves as an interface unit 
between the internal bus 25 and an external bus. The central processing unit 21 
furthermore includes a trace port 27. The trace port 27 provides a coupling between 
selected leads within the central processing unit 21 and the trace unit 28. The trace unit 
28 can provide an analysis of the trace signals received from the central processing unit 
21 and can determine the read data, cycle-by-cycle stalls, and the instruction sequence to 
be applied to the central processing unit 21. The trace unit 28 records traced data in the 
memory unit 29. The trace memory unit 29 records central processing unit-related 
activity. A processing unit 27 has access to the memory unit 29 storing the results of the 
trace acquisition and memory unit 26 storing the simulation model. The storage of the 
simulation model can also include the storage of parameters identifying the power 
dissipated for each central processing unit state transition. As will be clear, memory unit 
29 and memory unit 26 can be different portions of the same memory unit. The processor 
27 applies the trace acquisition results to the simulation model as will be described 
below. 
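
The description does not state how the stored per-transition power parameters are derived. As a minimal sketch only, the following assumes a simple switching-activity stand-in: a fixed cost per clock cycle plus a cost for each state bit that changes between consecutive states. The function name, the coefficient values, and the encoding of a state as an integer bit vector are all hypothetical.

```python
def transition_energy_pj(prev_bits: int, next_bits: int,
                         pj_per_toggle: float = 0.5,
                         static_pj: float = 1.0) -> float:
    """Estimate the energy of one state transition (hypothetical model).

    prev_bits / next_bits: CPU state before and after the clock edge,
    packed into integers (an assumed encoding, not from the specification).
    pj_per_toggle / static_pj: illustrative coefficients, not measured data.
    """
    # XOR leaves a 1 in every bit position that changed value.
    toggles = bin(prev_bits ^ next_bits).count("1")
    return static_pj + pj_per_toggle * toggles
```

A table of such values, indexed by state transition, could be stored alongside the simulation model in the manner the paragraph above describes.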

Referring to Fig. 3, the process for minimizing the power consumption during the 
execution of a software program is shown. In step 30, the simulation model is developed 
for the central processing unit executing the program. Typically, the simulation model is 
generated during the design of the central processing unit. In step 31, using the 
simulation model, the power dissipated for each state of the central processing unit is 
determined. In step 32, the initial state of the central processing unit is saved. In step 33, the 
program being tested with respect to power consumption is executed on the central 
processing unit. Using the trace components, the input and output signals are determined 
for each clock cycle in step 34. In step 35, the state of the simulation model is initialized 
to the initial state of the central processing unit. In step 36, the input and output signals 
and the stall events are applied to the simulation model as described herein. The actual 
central processing unit generating the trace data and the simulation model (of the same 
central processing unit) are viewed as identical finite state machines. The input signals to 
the real central processing unit (read data and machine stalls) are applied to the 
simulation model, the simulation model being a second finite state machine. When the 
two state machines are started from the same state, they progress through the same 
sequence of states. The trace data also includes program counter information. The 
program counter data is used to detect the occurrence of an interrupt in the instruction 
processing. As a result of an interrupt process, the state progression of the simulation 
model and the corresponding central processing unit can differ. The program counter 
trace data is used to override the program counter of the simulation model thereby 
keeping the two finite state machines synchronized. Using the simulation model, the 
state of the central processing unit can be determined for all of the program execution. In 
step 37, the state of the central processing unit, as determined in step 36, is correlated with 
the power dissipation for the related state as determined in step 31. As a result of the 
correlation in step 37, the power consumed as a function of program portion being 
executed can be determined in step 38. In step 39, the program and the power dissipation 
for the related portions of the program can be reviewed to determine whether the program 
can be adjusted to reduce power. 
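
Steps 35 and 36 can be sketched as a lockstep replay of the recorded trace on a model of the processor. The sketch below is illustrative only: `TraceRecord`, `CpuState`, `toy_step`, and `replay` are hypothetical names, the record layout is assumed, and the toy next-state function merely stands in for a real simulation model. It shows the two behaviors the text describes: a stall holds the model state for that cycle, and a traced program counter that disagrees with the model (for example, after an interrupt) overrides the model's program counter so the two state machines stay synchronized.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TraceRecord:
    """One clock cycle of captured trace data (assumed layout)."""
    read_data: int  # input signal applied to the CPU in this cycle
    stall: bool     # True if the CPU was stalled in this cycle
    pc: int         # program counter reported through the trace port


@dataclass(frozen=True)
class CpuState:
    """Toy model state: a program counter and a 16-bit accumulator."""
    pc: int
    acc: int


def toy_step(state: CpuState, read_data: int) -> CpuState:
    """Stand-in for the simulation model's next-state function."""
    return CpuState(pc=state.pc + 1, acc=(state.acc + read_data) & 0xFFFF)


def replay(initial: CpuState,
           trace: Iterable[TraceRecord],
           step: Callable[[CpuState, int], CpuState] = toy_step) -> List[CpuState]:
    """Return the model state for every traced clock cycle."""
    states: List[CpuState] = []
    state = initial  # step 35: start from the saved initial state
    for rec in trace:
        if state.pc != rec.pc:
            # The model diverged from the hardware (e.g. an interrupt was
            # taken): override the model PC with the traced PC.
            state = CpuState(pc=rec.pc, acc=state.acc)
        states.append(state)
        if not rec.stall:
            # Step 36: apply this cycle's input; a stall holds the state.
            state = step(state, rec.read_data)
    return states
```

The returned list holds one model state per clock cycle, which is the form needed for the correlation with per-state power in step 37.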

2. Operation of the Preferred Embodiment 

The present invention relies on the fact that, with the emerging test and emulation 
technology, detailed information can be obtained about the operation of a data processing 
system. The invention also relies on the fact that the development of the data 
processing system requires a detailed simulation model. From the simulation model, an 
estimate of the power being dissipated for each state of the central processing unit can be 
determined. When the program under investigation is executed by the central processing 
unit, the trace components can be used to determine all of the input signals (read data) 
applied to the central processing unit and the output signals generated by the central 
processing unit. The applied signals and the generated signals indicating the precise 
point in the program at which an interrupt was taken resulting from the execution of the 
program are applied to the simulation model. The simulation model identifies a state 
defined by the applied and generated signals. As indicated above, the simulation model 
can be used to estimate the power consumed for each state. 

Thus the power consumed for each state is known as well as the progression of 
the states during the execution of the program. The power consumed for the progression 
of states can be correlated to the executing program. Therefore, the power dissipated as a 
function of the program can be determined. Areas of exceptionally high power 
consumption of the executing program can be determined and analysis of the code can be 
performed to determine whether an alternative code strategy can be employed to reduce 
the power dissipation. 
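
The correlation of per-cycle power with the executing program can be sketched as a simple aggregation keyed on the program counter. This is a hedged illustration: the function names, the representation of program portions as named half-open address ranges, and the units are assumptions made here, not details taken from the description.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# (name, start_pc, end_pc) with end_pc exclusive; a hypothetical representation
Region = Tuple[str, int, int]


def energy_by_region(cycle_pcs: Sequence[int],
                     cycle_power_mw: Sequence[float],
                     regions: List[Region],
                     clock_period_ns: float) -> Dict[str, float]:
    """Sum per-cycle energy (in picojoules) for each named program region.

    cycle_pcs[k] is the program counter for clock cycle k and
    cycle_power_mw[k] is the power estimated for the state in that cycle.
    """
    totals: Dict[str, float] = defaultdict(float)
    for pc, power_mw in zip(cycle_pcs, cycle_power_mw):
        # Find the first region containing this PC, if any.
        name = next((n for n, lo, hi in regions if lo <= pc < hi), "unmapped")
        totals[name] += power_mw * clock_period_ns  # mW * ns = pJ
    return dict(totals)


def hottest_region(totals: Dict[str, float]) -> str:
    """Return the region that consumed the most energy."""
    return max(totals, key=totals.get)
```

For example, with program counters `[0, 1, 1, 100, 101]`, per-cycle power `[10.0, 12.0, 4.0, 30.0, 30.0]` mW, regions `main` (0 to 64) and `isr` (96 to 128), and a 5 ns clock period, the totals are 130.0 pJ for `main` and 300.0 pJ for `isr`, so `isr` would be the first candidate for code review.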

While the invention has been described with respect to the embodiments set forth 
above, the invention is not necessarily limited to these embodiments. Accordingly, other 
embodiments, variations, and improvements not described herein are not necessarily 
excluded from the scope of the invention, the scope of the invention being defined by the 
following claims. 



TI-33319 






